ABSTRACT

Regulation of SMN2 exon 7 splicing is crucial for the 
production of active SMN protein and the survival of 
Spinal Muscular Atrophy (SMA) patients. One of the 
most efficient activators of exon 7 inclusion is hnRNP G,
which is recruited to the exon by Tra2-β1.
We report that in addition to the C-terminal region 
of hnRNP G, the RNA Recognition Motif (RRM) and 
the middle part of the protein containing the Arg- 
Gly-Gly (RGG) box are important for this function. 
To better understand the mode of action of hnRNP 
G in this context we determined the structure of its 
RRM bound to an SMN2 derived RNA. The RRM in- 
teracts with a 5'-AAN-3' motif and specifically recog- 
nizes the two consecutive adenines. By testing the 
effect of mutations in hnRNP G RRM and in its puta- 
tive binding sites on the splicing of SMN2 exon 7, we 
show that it specifically binds to exon 7. This interac- 
tion is required for hnRNP G splicing activity and we 
propose its recruitment to a polyA tract located upstream
of the Tra2-β1 binding site. Finally, our data
suggest that hnRNP G plays a major role in the recruitment
of the Tra2-β1/hnRNP G/SRSF9 trimeric
complex to SMN2 exon 7.

INTRODUCTION 

Spinal Muscular Atrophy (SMA) is an inherited disease 
characterized by degeneration of the spinal cord α-motor
neurons, which results in system-wide muscle wasting (1).
SMA is considered one of the most frequent genetic causes 
of infantile death with an incidence rate of 1 in 6000 (2,3). 
It is caused by the homozygous inactivation of the Survival
of Motor Neuron-1 (SMN1) gene (4). In humans, another
copy of the SMN gene (SMN2) is present, which is
nearly identical to SMN1 with only five nucleotide substitutions,
including a silent cytosine to thymine (C to T) mutation
at position +6 of exon 7 (5). This point mutation inactivates
an Exonic Splicing Enhancer (ESE), resulting in
exon 7 skipping in most SMN2 transcripts (6,7). This splicing
isoform encodes an unstable truncated form of the SMN
protein (8). Consequently, the amount of functional SMN
protein produced from the SMN2 gene is not sufficient to
compensate for the absence of SMN1 gene expression and
maintain functional motor neurons (8,9). Importantly, all SMA
patients have at least one intact copy of the SMN2 gene 
in their genome (3,5). Acting on the splicing regulation of 
SMN2 to favor the inclusion of exon 7 is therefore a promis- 
ing strategy to increase the cellular level of functional SMN 
proteins and develop a treatment for SMA patients (1,10,11).

Several positive and negative regulators of SMN2 exon 7 
splicing have been identified (1). One of the most efficient 
activators of exon 7 inclusion is the protein hnRNP G (12). 
It was proposed that hnRNP G is recruited to the SMN2 
pre-mRNA by its interaction with another splicing factor 
named Tra2-β1 (12-14). The structure of Tra2-β1 RRM
bound to RNA was recently determined by NMR showing
that this factor recognizes specifically a 5'-AGAA-3' motif
within an ESE located at position +21 of exon 7 (14,15). 
Both proteins act in synergy and can activate exon 7 in- 
clusion to up to 80% when overexpressed simultaneously, 
a level that could not be reached when each protein was 
overexpressed separately (12,16). Understanding the mode 
of interaction of this heterodimer with RNA at the molecu- 
lar level would then facilitate the development of therapeu- 
tic methods that stabilize its binding to exon 7. 

HnRNP G belongs to the heterogeneous nuclear ribonu- 
cleoproteins family and is encoded by the RBMX gene lo- 
cated on the X chromosome (17). Although this protein is 
ubiquitously expressed (18,19), its expression level is variable
and tissue-dependent (20). hnRNP G has multiple functions.
In addition to SMN2, it regulates the splicing of dystrophin,
αs-tropomyosin (20), and Tau pre-mRNAs (21,22).
Moreover, hnRNP G was shown to be involved in the
transcriptional regulation of SREBP-1c (23,24) and GnRH1 (25).
Several reports have also linked hnRNP G to cancer as it 
suppresses tumor growth at least in part by up-regulating 
the transcription of the tumor suppressor Txnip (26,27) or 
by modulating apoptosis (28). Other functions of hnRNP 
G also include the regulation of the neural development of 
frog (29) and zebrafish embryos (30) and the extracellular
release of TNFR1, a receptor mediating the inflammatory
actions of TNF (31). Finally, it was reported that hnRNP G 
is a regulator of sister chromatids cohesion (32) and that its 
expression is enhanced by p53 in response to DNA damage 
(33,34). 

HnRNP G is composed of an N-terminal canonical RNA 
Recognition Motif (RRM) followed by a succession of mo- 
tifs, namely an RGG box with three Arg-Gly-Gly repeats 
(17), a Nascent Targeting Domain (NTD) and a C-Terminal 
Domain (CTD) containing a SRGY box, Arg-Ser (RS) re- 
peats (35) and a second RNA binding site (C-RBD) (36) 
(Figure 1A). Although the C-RBD part of the CTD was
proposed to bind a 5'-GGAAA-3' capped stem-loop (36),
the RRM is believed to be primarily responsible for the
binding of hnRNP G to RNA (37). In the context of SMN2
exon 7 splicing regulation, the CTD was reported to interact
with Tra2-β1 (12). To date, the specificity of RNA
recognition by hnRNP G RRM remains elusive. This domain
shares 88% sequence similarity with the RRM of its
paralogue in testis, RBMY. The structure of RBMY RRM
bound to RNA was determined and showed that it interacts
specifically with 5'-CAA-3' capped RNA stem-loops
(38). Most residues involved in RNA recognition are conserved
in hnRNP G, suggesting that the protein could also
recognize 5'-CAA-3' containing sequences. However, Systematic
Evolution of Ligands by Exponential Enrichment
(SELEX) experiments conducted with hnRNP G showed
that its RRM binds single-stranded 5'-CCA-3' or 5'-CCC-3'
containing sequences (37). Finally, another study suggested
that hnRNP G could bind to a 5'-AAGU-3' motif (20).
These contradictory data prevent the prediction of putative 
binding sites for this protein and the correct understanding 
of its functions in cells. 

In this study, we determined the structure of hnRNP 
G RRM bound to RNA using NMR. The structure re- 
vealed that this domain specifically recognizes two consecu- 
tive adenines. This allowed us to identify a putative binding 
site for this protein on SMN2 and to propose a model in 
which hnRNP G binds specifically to exon 7 upstream of 
the Tra2-β1 binding site. Finally, we identified the regions
of hnRNP G that are required for its activity as a regulator 
of SMN2 exon 7 splicing. 

MATERIALS AND METHODS 

Expression and purification of the recombinant proteins 

Escherichia coli BL21 (DE3) codon plus cells transformed
with pET28a::hnRNP G RRM (residues 1-95),
pET24b::GB1-hnRNP G RRM + RGG (residues 1-127)
or pET28a::Tra2-β1 RRM (residues 106-201) (14) were
grown at 37°C in M9 minimal medium supplemented with
50 µg/ml kanamycin, 50 µg/ml chloramphenicol, 1 g/l
15NH4Cl and 4 g/l unlabeled or 2 g/l 13C labeled glucose for
15N or 15N and 13C labeled proteins, respectively. Proteins
were purified by two successive nickel affinity chromatography
(Qiagen®) steps, as previously described (14), and dialyzed
against the hnRNP G NMR buffer (50 mM NaCl, 20 mM
NaH2PO4, pH 5.5) or the Tra2-β1 NMR buffer (50 mM
L-Arg, 50 mM L-Glu, 0.05% β-mercaptoethanol, 20 mM
NaH2PO4 pH 5.5) (14) for hnRNP G and Tra2-β1 recombinant
proteins, respectively. Concentration of recombinant
proteins was carried out using 10-kDa molecular mass cut-off
Centricons (Vivascience®). The absence of RNases was
confirmed using the RNase Alert Lab Kit (Ambion®).

ORFs encoding full-length hnRNP G and Tra2-β1 were
cloned in the pEX::SUMO vector (39) and expressed in the
MC1061 E. coli strain. The cells were grown in LB and induced
at 18°C with 0.1 g of arabinose/l of culture. The insoluble
fraction of lysed cells was dissolved in 6 M urea and
the unfolded proteins were purified using a Ni-NTA column
(Qiagen®). The purified proteins were refolded by rapid dilution
(20x) in the refolding buffer (880 mM L-arginine, 21
mM NaCl, 0.88 mM KCl, 55 mM Tris, pH 8.2) and subsequently
dialyzed in the gel shift buffer (200 mM L-Arg, 200
mM L-Glu, 0.05% β-mercaptoethanol, 20 mM Na2HPO4
pH 7). The proteins were then concentrated and used for
gel shift experiments. 

Preparation of RNA-protein complexes 

All RNA oligonucleotides were purchased from
Dharmacon®, de-protected according to the manufacturer's
instructions, lyophilized and resuspended
in the corresponding NMR buffer. The
5'-GAGACAAAAUCAAAAAGAAG-3' RNA was transcribed
in vitro, purified by HPLC and resuspended in the
Tra2-β1 NMR buffer.

The hnRNP G RRM-RNA complex was prepared in the 
hnRNP G NMR buffer at a protein:RNA stoichiometric
ratio of 1:1 in a final volume of 250 µl and at a final
concentration of 1 mM.

The NMR titrations performed with hnRNP G RRM
were done in the hnRNP G NMR buffer (Supplementary
Figures S4 and S5), while those performed in the
presence of hnRNP G and Tra2-β1 RRMs (Figure 4
and Supplementary Figure S6A) or hnRNP G RRM
+ RGG (Supplementary Figure S7) were done in the
Tra2-β1 NMR buffer with RNA and protein concentrations
of 0.2 mM. In Figures 4 and S6A, increasing
amounts of 15N-labeled Tra2-β1 or hnRNP G RRMs
were first added to the 5'-UCAAAAAGAAG-3' or
5'-GAGACAAAAUCAAAAAGAAG-3' RNA. After reaching
saturation at a stoichiometric ratio of 1:1, 15N-labeled hnRNP
G RRM was added in a stepwise manner to the
Tra2-β1/RRM complex until a final stoichiometric ratio of the
three components of 1:1:1.

NMR experiments 

All the NMR spectra were recorded in the hnRNP G NMR 
buffer except titrations with Tra2-β1 RRM, which were
recorded in the Tra2-β1 NMR buffer. Experiments were
recorded at 313 K using Bruker AVIII-500 MHz, 600 MHz,
700 MHz, Avance-900 MHz equipped with a cryoprobe,
and AVIII-750 MHz spectrometers. Topspin 2.1 (Bruker®)
was used for data processing and Sparky
(http://www.cgl.ucsf.edu/home/sparky/) for data analysis.

Protein backbone assignment was achieved using 2D 1H-15N
HSQC, 3D HNCA, 3D CBCACONH and 3D HNCACB,
while side chain assignments were achieved using
2D 1H-13C HSQC, 3D HcccoNH TOCSY, 3D hCccoNH
TOCSY, 3D NOESY 1H-15N HSQC and 3D NOESY 1H-13C
HSQC aliphatic. Aromatic protons were assigned using
2D 1H-1H TOCSY and 3D NOESY 1H-13C HSQC aromatic (40).

Figure 1. Study of the interaction of hnRNP G with its proposed binding site on SMN2 exon 7. (A) Schematic representation of hnRNP G domains
composition. The RRM is located at the N-terminus followed by an RGG domain, an NTD, and a CTD. The CTD comprises an SRGY motif and a C-RBD.
Amino acids located at the boundaries of each domain are numbered. The sequence of the C-RBD is shown with RS repeats marked in bold. (B)
Schematic representation of the SMN2 minigene containing exon 6 to exon 8 with intermittent introns. The sequence of exon 7 is shown. The Tra2-β1
binding site is in red and the previously proposed hnRNP G binding site in bold (14). The RNA sequence tested for the interaction with hnRNP G RRM
is underlined. The sequence in blue indicates another putative hnRNP G binding site. (C) Overlay of 1H-15N Heteronuclear Single Quantum Coherence
(HSQC) spectra recorded during NMR titration of the 15N labeled hnRNP G RRM with increasing amounts of the unlabeled 5'-AUCAAA-3' RNA. The
titration was performed at 40°C in the hnRNP G NMR buffer. The peaks corresponding to free and RNA-bound protein states (RNA:protein ratios of
0.3:1 and 1:1) are blue, orange and red, respectively. Negative peaks corresponding to amides of arginine side chains in the free and RNA-bound (1:1 ratio)
forms are green and orange, respectively. Black arrows indicate highest chemical shift perturbations observed upon RNA binding. (D) Representation of
the combined chemical shift perturbations ($\Delta\delta = [(\delta_{HN})^2 + (\delta_{N}/6.51)^2]^{1/2}$) of hnRNP G amides upon binding to the 5'-AUCAAA-3' RNA at a ratio
of 1:1 as a function of hnRNP G residue numbers. The corresponding secondary structure elements are represented at the top of the graph. The highest
chemical shift perturbations annotated in (C) are indicated.

RNA resonance assignments in complex with hnRNP G
RRM were performed using 2D 1H-1H TOCSY, natural
abundance 2D 1H-13C HSQC and 2D 13C 1F-filtered 2F-filtered
NOESY (41) in 100% D2O. Intermolecular NOEs
were obtained using 2D 1H-1H NOESY and 3D 13C 1F-edited
3F-filtered HSQC-NOESY (42) in the presence of
unlabeled RNA and 15N- and 15N-13C-labeled proteins,
respectively.

All NOESY spectra were recorded with a mixing time of 
150 ms, the 3D TOCSY spectrum with a mixing time of 
17.75 ms and the 2D TOCSY with a mixing time of 60 ms. 

Structure calculation and refinement 

AtnosCandid software (43,44) was used to generate prelim- 
inary structures and a list of automatically assigned NOE 
distance constraints for hnRNP G RRM in complex with 
RNA. Peak picking and NOE assignments were performed 
using 3D NOESY (^^N- and ^^C-edited) spectra. Addition- 
ally, intra-protein hydrogen bond constraints were based on 
hydrogen-deuterium exchange experiments on the amide 
protons. For these hydrogen bonds, the oxygen acceptors 
were identified based on preliminary structures calculated 
without hydrogen bond constraints. Protein dihedral angle 
constraints were generated by TALOS (45). 

Seven iterations were performed and 80 independent 
structures were calculated at each iteration step. Struc- 
tures of the protein-RNA complexes were calculated with 
CYANA (43) by adding the manually assigned intra- 
molecular RNA and RNA-protein intermolecular distance 
restraints. For each CYANA run, 50 independent structures
were calculated. These 50 structures were refined with the 
SANDER module of AMBER 9.0 (46) by simulated an- 
nealing in implicit water using the ff99SB protein force field 
(47). The 20 best structures based on energy and NOE vio- 
lations were analyzed with PROCHECK (48). Figures were 
generated with MOLMOL (49). 

Isothermal titration calorimetry 

Isothermal titration calorimetry (ITC) experiments were 
performed on a VP-ITC instrument (Microcal®). The 
calorimeter was calibrated according to the manufacturer's
instructions. Concentrations of proteins and RNAs were 
determined using optical density absorbance at 280 and 
260 nm, respectively. Each tested RNA (20 µM) was
titrated with 400 µM of hnRNP G RRM variants.
Both protein and RNA were in the same NMR buffer.
The injection protocol used was 40 injections of 6 µl every
5 min. All measurements were done at 40°C. Raw data
were integrated and normalized for the molar concentra- 
tion. After subtraction of the reference data recorded in the 
absence of RNA (Supplementary Figure S3B), the resulting 
integrated data were analyzed using the Origin 7.0 software 
according to a 1:1 RNA:protein ratio binding model. 
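
As a rough illustration of what a 1:1 binding model implies, the sketch below evaluates the equilibrium concentration of the protein:RNA complex for given total concentrations and a dissociation constant. It is not the Origin 7.0 fitting routine and does not fit calorimetric heats; the function names and the example concentrations are assumptions made for illustration only.

```python
import math


def complex_concentration(protein_total, rna_total, kd):
    """Equilibrium [protein:RNA] for a 1:1 model.

    All three arguments must use the same concentration unit (e.g. µM).
    """
    b = protein_total + rna_total + kd
    # Physically meaningful (smaller) root of PL^2 - b*PL + P_tot*L_tot = 0
    return (b - math.sqrt(b * b - 4.0 * protein_total * rna_total)) / 2.0


def fraction_rna_bound(protein_total, rna_total, kd):
    """Fraction of the RNA that is in complex at equilibrium."""
    return complex_concentration(protein_total, rna_total, kd) / rna_total


if __name__ == "__main__":
    # Hypothetical example: 20 µM RNA, 20 µM protein, Kd = 18 µM
    print(f"{fraction_rna_bound(20.0, 20.0, 18.0):.2f}")  # prints 0.40
```

With a Kd in the tens of micromolar, an equimolar mixture at 20 µM leaves most of the RNA free, which is why a large excess of protein is titrated in.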

Cell culture and plasmids 

HEK293 (human embryonic kidney) cells were cultured
in Dulbecco's modified Eagle's medium (DMEM) supplemented
with 10% fetal bovine serum (FBS). The pCI-SMN2
plasmid containing the SMN2 minigene was previously
described (16). The human hnRNP G ORF was amplified
by PCR and cloned in the pcDNA3.1 mammalian expression
vector with an N-terminal FLAG tag (pcFLAG)
to generate the pcFLAG-hnRNP G plasmid. The hnRNP
G ΔRRM (90-391), Δc57 (1-334), Δ95-184 (1-95 fused
to 184-391), Δ95-235 (1-95 fused to 235-391), and Δ95-250
(1-95 fused to 250-391) constructs were amplified by PCR using
pcFLAG-hnRNP G as a template and subsequently cloned
in the pcFLAG vector. SMN2 and hnRNP G mutants were 
created by site-directed mutagenesis using specific primers. 

In vivo splicing assay 

One microgram of pCI-SMN2 [wild-type (WT) or mutant]
was co-transfected with 1 µg of pcFLAG-hnRNP
G (WT or mutant) in HEK293 cells plated in six-well
plates using the calcium phosphate method. Total RNA was
extracted 48 h after transfection and 1 µg was then used
for the reverse transcription reaction using Oligo(dT) and
M-MuLV Reverse Transcriptase RNase H- (Finnzyme®).
One-tenth of the resulting cDNA was used for PCR
amplification using a vector specific forward primer
(pCI-fwd: 5'-GGTGTCCACTCCCAGTTCAA-3') and
an SMN2 specific reverse primer (SMNex8-rev:
5'-GCCTCACCACCGTGCTGG-3'). PCR products were
then resolved on a 2% agarose gel. The same results were
obtained when using 32P labeled SMNex8-rev primer and
resolving the PCR products on a 4% polyacrylamide gel.
The bands corresponding to the products of the splicing 
reaction were then quantified using ImageQuant. Exper- 
iments were repeated independently three times allowing 
the calculation of the mean and standard deviation for 
each assay. 
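
The quantification step can be sketched as follows. This is only an illustrative sketch: the text does not spell out the formula, so the inclusion percentage is assumed here to be the intensity of the exon 7-included band divided by the sum of both bands, the sample standard deviation is assumed, and the band intensities in the example are invented.

```python
from statistics import mean, stdev


def percent_inclusion(included, skipped):
    """Percentage of exon 7 inclusion from two band intensities (assumed formula)."""
    total = included + skipped
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * included / total


def summarize(replicates):
    """Mean and sample standard deviation over (included, skipped) pairs."""
    values = [percent_inclusion(inc, skip) for inc, skip in replicates]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical band intensities from three independent experiments
    avg, sd = summarize([(60.0, 40.0), (65.0, 35.0), (70.0, 30.0)])
    print(f"{avg:.1f} +/- {sd:.1f} % inclusion")
```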

Gel shift assay 

The RNA was dephosphorylated using Antarctic phosphatase
(NEB) and rephosphorylated with T4 polynucleotide
kinase (NEB) in the presence of γ-32P ATP. The
protein-RNA complexes were formed in the gel shift buffer.
Tra2-β1 protein amounts were similar to those used previously
(50). Five femtomoles of labeled RNA were mixed
with increasing amounts of protein in a final volume of 6
µl and incubated for 30 min on ice. The resulting RNA-
protein complexes were subsequently resolved on 6% native 
polyacrylamide gel at 4°C. 

RESULTS 

Structure determination 

We previously proposed a model suggesting that Tra2-β1
could recruit hnRNP G to SMN2 exon 7 upstream of its
5'-AGAA-3' binding site (14). Among all the RNA motifs
proposed to interact specifically with hnRNP G (14,20,37),
only a 5'-CAA-3' sequence located upstream of the Tra2-β1
binding site could be identified as a putative hnRNP
G binding site (Figure 1B). As hnRNP G RRM was previously
shown to be primarily responsible for the binding
of the protein to RNA (37), we purified a recombinant
protein containing the first 95 amino acids of human hnRNP
G (Figure 1A) fused to an N-terminal His-tag. We
then tested its interaction with the 5'-AUCAAA-3' RNA
sequence, which corresponds to its putative binding site on
SMN2 exon 7 (Figure 1B). NMR titration of hnRNP G
RRM with increasing amounts of this RNA showed that
they interact (Figure 1C). Saturation was reached
at a stoichiometric ratio of 1:1 and the protein resonances
experienced fast to intermediate exchange throughout the
titration steps. Mapping of the chemical shift perturbations
observed during this titration revealed that residues from
the β-sheet and the C-terminal region were primarily affected
upon RNA binding (Figure 1D).
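
The combined chemical shift perturbation plotted in Figure 1D (square root of the squared 1H shift plus the squared 15N shift scaled by 6.51, as given in the figure legend) can be computed per residue as in the minimal sketch below. The function name and the example shift values are assumptions for illustration; they are not data from this study.

```python
import math


def combined_csp(delta_hn_ppm, delta_n_ppm, n_scale=6.51):
    """Combined amide chemical shift perturbation in ppm.

    delta_hn_ppm: 1H shift difference (bound minus free)
    delta_n_ppm:  15N shift difference (bound minus free)
    n_scale:      scaling factor applied to the 15N dimension (6.51 in the legend)
    """
    return math.sqrt(delta_hn_ppm ** 2 + (delta_n_ppm / n_scale) ** 2)


if __name__ == "__main__":
    # Hypothetical per-residue shift differences (1H, 15N) in ppm
    shifts = {"residue_a": (0.30, 2.604), "residue_b": (0.05, 0.10)}
    for name, (d_h, d_n) in shifts.items():
        print(name, round(combined_csp(d_h, d_n), 3))
```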

To characterize this interaction at the atomic level, we determined
the structure of hnRNP G RRM bound to the
5'-AUCAAA-3' RNA using 2395 distance restraints derived
from Nuclear Overhauser Effects (NOEs) (Supplementary
Table S1). The binding interface was characterized by 97
intermolecular NOEs (Supplementary Figure S1 and Table
S1). The precision of the structure was high, with a
Root-Mean-Square Deviation (RMSD) of 0.68 Å for all
heavy atoms of the twenty lowest energy conformations
represented in the final ensemble (Figure 2A).

Structure of hnRNP G RRM in complex with RNA 

The structure shows that the RRM adopts a canonical
β1α1β2β3α2β4 fold (51) with the RNA lying as a single
strand on the β-sheet surface (Figure 2B). In addition to the
β-sheet surface, the C-terminal region of the RRM participates
directly in the interaction with the RNA. The strong
H1'-H2' correlations observed in a 2D total correlation
spectroscopy (TOCSY) experiment indicate that all riboses
of the RNA adopt a C2'-endo conformation.

The three adenines located at the 3' end of the RNA sequence
are contacted by the RRM, but only adenines 4 and
5 are specifically recognized (Figure 2B). The base of A4
stacks on the ring of Phe11 located within the RNP2 motif
and is specifically recognized by three hydrogen bonds (Figure
2C). Two involve the side chains of Lys80 and Glu82
while the third one is formed with the backbone amide of
Thr85 (Figure 2C). The base of A4 adopts an unusual syn
conformation, which is most likely induced by the presence
of two aromatic groups located in the RNP1 motif, namely
Phe51 and Phe53 (Figure 2B and C). In an anti conformation,
the base would probably experience a steric clash with
the rings of these aromatic residues. The base of A5 is sandwiched
between Phe53 located within the RNP1 motif and
Pro87 from the C-terminal region and is specifically recognized
by two hydrogen bonds formed with the side chain
of Lys9 and the backbone carbonyl oxygen of Thr85 (Figure
2D). Additional contacts that are not involved in the
specific recognition of the adenines but rather stabilize the
protein-RNA interactions are also observed. A hydrogen
bond is formed between the backbone amide of Ser88 and
the phosphate group of A5, and explains the downfield shift
observed for the Ser88 amide proton upon RNA binding
(Figures 1C and 2D). Moreover, Phe51 forms hydrophobic
contacts with the riboses of A4 and A5 (Figure 2C and D).
In contrast to A4 and A5, A6 is not specifically recognized
as the base forms only hydrophobic contacts with Phe89
located in the C-terminal region of the RRM. Finally, the
side chain of Arg49 primarily forms a hydrogen bond with
the phosphate group of A6 (Figure 2B). However, this side
chain seems to be flexible as in some structures of the ensemble
it rather forms a hydrogen bond with the phosphate
group of A5.

To verify the importance of these interactions, we individually
mutated to alanine most of the residues involved in RNA binding
and tested the effect of these mutations on
the RNA binding affinity of the RRM using ITC and NMR
titrations. We verified by NMR that none of these point mutations
affects the global fold of the RRM (Supplementary
Figure S2). In agreement with the structure, all the mutations
tested significantly decreased the affinity of the domain
for the 5'-AUCAAA-3' RNA (Supplementary Figures
S3C and S4B). The ITC data could be fitted in the presence
of the WT protein and an apparent equilibrium dissociation
constant (Kd) of 18 µM was calculated from the curve
(Supplementary Figure S3A). In the presence of the protein
mutants, the affinity becomes too low to allow the fitting
of recorded ITC data and estimate a Kd (Supplementary
Figure S3C). However, the decrease in affinity was unambiguously
confirmed by NMR titrations performed with the
K80A and F89A protein variants, as chemical shift perturbations
were smaller and experienced fast instead of intermediate
exchange (Supplementary Figure S4B). A similar
effect was observed when one or both of the specifically recognized
adenines were mutated to cytosine (Supplementary
Figures S3D and S5), strongly supporting the specific recognition
of these two consecutive nucleotides by the RRM
of hnRNP G. The importance of the non-specific interaction
of hnRNP G RRM with the third nucleotide of the
5'-AAN-3' motif was also validated by the decrease in affinity
observed with the 5'-AUCCAA-3' RNA, which still retains
the two consecutive adenines but misses the last nucleotide
(Supplementary Figures S3D and S5). Finally, we tested the
binding affinity of hnRNP G RRM for the 5'-UAAGAC-3'
RNA, which contains flanking U and G nucleotides (underlined)
instead of C and A in the WT sequence. In agreement
with our structure, replacing the unbound cytosine
by a uracil and the non-specifically recognized adenine by
a guanine did not affect the RNA binding affinity of hnRNP
G (Supplementary Figures S3D and S4A). Altogether,
these data strongly support the intermolecular contacts observed
in our structure and validate 5'-AAN-3' (N is for any
nucleotide) as the minimal motif required for a sequence-specific
interaction of hnRNP G RRM with RNA.

HnRNP G interacts specifically with SMN2 exon 7 using its 
RRM 

HnRNP G was previously shown to be recruited by Tra2-β1
to SMN2 exon 7 (12). As several 5'-AAN-3' putative
binding sites are present around the Tra2-β1 binding site
(Figure 1B), we investigated whether hnRNP G could interact
specifically with these motifs. We co-transfected HEK
293 cells with plasmids that encode different versions of the
hnRNP G protein and an SMN2 minigene containing exons
6-8 of SMN2 with the intermittent introns (16). After
RNA isolation, we monitored the levels of skipping or inclusion
of exon 7 by RT-PCR. As previously reported (12),
overexpression of full-length hnRNP G activated the inclusion
of exon 7 from around 20% in the absence of ectopic
hnRNP G expression to around 65% (Figure 3A, lanes 1
and 2). Cell transfection with a truncated version of hnRNP
G lacking the RRM induced a strong decrease in exon
7 inclusion to only 35% (Figure 3A, lane 3), showing that
this domain is required for the function of hnRNP G as a
splicing regulator of SMN2. It strongly suggested that hnRNP
G interacts directly with the polyA tract located in
exon 7 (Figure 1B). To determine whether this interaction
was specific, we tested the effect of mutations in hnRNP G
that affect its ability to recognize adenines on SMN2 exon
7 splicing. As shown in Figure 3A, all the mutations tested
significantly decreased the level of exon 7 inclusion except
the replacement of Phe53 with alanine, which had an effect
only when combined with the Lys9 mutation to alanine
(Figure 3A, lanes 4-7). The absence of effect observed with
the F53A mutation can be explained by the fact that hnRNP
G is recruited to SMN2 by Tra2-β1 (12). Indeed, this
inter-protein interaction could compensate for the decrease
in RNA affinity induced by the single mutation. Finally,
the effects observed for the F11A mutation and the K9A +
F53A double mutation were very close to a full truncation
of the RRM (Figure 3A, lanes 3, 5 and 7), indicating that
the RRM of hnRNP G interacts with SMN2 specifically by
recognizing two consecutive adenines.

Figure 2. Overview of the solution structure of hnRNP G RRM bound to the 5'-AUCAAA-3' RNA. (A) Ensemble of the 20 lowest energy calculated
structures fitted on the protein backbone and heavy atoms of the RNA. The protein backbone is shown in gray, with the C-terminus in orange. The heavy
atoms of the RNA are colored yellow for carbon, blue for nitrogen, red for oxygen and orange for phosphorus. The unstructured first three nucleotides and
the N-terminus (amino acids 1-7) are hidden. (B) A representative structure from the ensemble showing the RNA bound to hnRNP G RRM. The protein
and the RNA are represented in ribbons and sticks, respectively. The side chains of amino acids involved in the interaction with the RNA are represented in
green sticks. The color scheme is the same as in (A). The N and C termini are colored in blue and orange, respectively. Hydrogen bonds are represented by
purple dashed lines. (C) Molecular recognition of A4 by hnRNP G RRM. Side chains of amino acids involved in the interaction are shown. Color scheme
is as in (B). (D) Molecular recognition of A5 by hnRNP G RRM. Representation is similar to (C). All figures were generated with MOLMOL (49).



HnRNP G binds an A-tract located upstream of the Tra2-β1
binding site on SMN2 exon 7

As hnRNP G was shown to be recruited to SMN2 exon 7
by Tra2-β1 (12), we examined sequences located on both
sides of the Tra2-β1 binding site and identified three regions
containing at least two consecutive adenines, two upstream
(A11-A14 and A17-A20) and one downstream (A27-A28) (Figure
1B). Mutations of adenines to uracils in the regions A11-A12
or A17-A18 were previously tested and had a strong effect
on SMN2 exon 7 splicing (14). We then tested the effect
of the A27 to U mutation, but could not detect any significant
effect on exon 7 splicing when hnRNP G was overexpressed
(Figure 3B). These results strongly suggest that hnRNP
G binds to the adenine-tracts located upstream of the
Tra2-β1 binding site rather than downstream (Figure 1B).

Figure 3. Effect of mutations in hnRNP G and the pre-mRNA on SMN2 exon 7 splicing. (A) RT-PCR gel shows the levels of exon 7 inclusion in SMN2
mRNAs upon overexpression of WT or mutated versions of hnRNP G in HEK 293 cells. The positions of PCR products corresponding to SMN2 mRNA
with or without exon 7 are indicated on the right of the gel. The graph is the result of at least three independent experiments. Error bars represent standard
deviations. The negative control corresponds to the percentage of exon 7 inclusion in the absence of ectopic hnRNP G expression. (B) RT-PCR gel shows
the effect of the GGA to UUU mutation in the potential hnRNP G binding site located downstream of the Tra2-β1 binding site (Figure 1B) on SMN2
exon 7 splicing. Conditions are similar to (A).

To investigate whether hnRNP G RRM could bind to
the A17-A20 site without inducing any steric hindrance with
the RNA bound Tra2-β1 RRM, we first investigated by
NMR the interaction of each of these two RRMs with the
SMN2 exon 7 derived RNA 5'-U15CAAAAAGAAG25-3'
(Figure 1B). This RNA sequence contains the motif 5'-AGAA-3',
which was previously identified as the Tra2-β1
binding site (14), and the closest upstream putative hnRNP
G binding site 5'-AAA-3' (Figure 1B). As illustrated in Figure
4, the chemical shift perturbations observed upon binding
of Tra2-β1 or hnRNP G RRM to the long RNA
(5'-UCAAAAAGAAG-3') or to RNAs containing their single
binding sites (5'-AAGAAC-3' and 5'-AUCAAA-3' for
Tra2-β1 and hnRNP G, respectively) were similar (Figure
4A and B). This showed that both Tra2-β1 and hnRNP G
RRMs could interact with their expected binding sites
(5'-AGAA-3' and 5'-AAN-3', respectively) in the context of the
long RNA. Surprisingly, our data also show that hnRNP G
RRM binds to the long RNA with a higher affinity than
the short RNA tested (Kd of 0.6 µM instead of 18 µM)
(Supplementary Figures S3A and S6B). In agreement, some
chemical shift perturbations had a larger magnitude or experienced
an intermediate instead of fast exchange regime in
the presence of the long RNA (Figure 4A and B). This difference
in affinity is most likely the result of an avidity effect
due to the presence of five overlapping 5'-AAN-3' motifs in
the long RNA 5'-UCAAAAAGAAG-3' (Figure 4B), which
could all be bound by hnRNP G in the absence of Tra2-β1.
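
The count of overlapping 5'-AAN-3' motifs can be reproduced with a short scan over the sequence. The sketch below is only an illustration: the function name is hypothetical, positions are 0-based indices within the string passed in (not the exon 7 numbering used in the text), and a match is simply two consecutive adenines followed by any nucleotide.

```python
def find_aan_sites(rna):
    """Return 0-based start positions of every (overlapping) 5'-AAN-3' motif.

    A site needs two consecutive adenines followed by at least one more
    nucleotide, so an 'AA' at the very 3' end is not counted.
    """
    rna = rna.upper()
    return [i for i in range(len(rna) - 2) if rna[i:i + 2] == "AA"]


if __name__ == "__main__":
    for seq in ("UCAAAAAGAAG", "AUCAAA", "AUCCAA"):
        sites = find_aan_sites(seq)
        print(seq, len(sites), sites)
```

For the three sequences above the scan finds five sites, one site and no site, respectively, in line with the avidity argument for the long RNA and with the loss of affinity reported for 5'-AUCCAA-3'.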

We then tested the binding of hnRNP G RRM to
the complex formed by Tra2-β1 RRM with the long
5'-UCAAAAAGAAG-3' RNA. At a Tra2-β1:hnRNP
G:RNA ratio of 1:1:1, chemical shifts correspond to the
RNA-bound states of both proteins (Figure 4C and Supplementary
Figure S6A), showing that Tra2-β1 and hnRNP
G RRMs could both be accommodated on a single RNA
molecule containing their adjacent binding sites. In addition,
the absence of additional chemical shift perturbations
between spectra recorded in the presence of RNA bound to
either each protein alone or to both proteins together indicates
that the two RRMs do not interact with each other when they
bind two adjacent binding sites (Figure 4B and C).

Figure 4. Selection of chemical shift perturbations observed upon titrations of Tra2-β1 and hnRNP G RRMs with SMN2 exon 7 derived RNAs. (A) Close-ups
on the titration of Tra2-β1 and hnRNP G RRMs with the unlabeled 5'-AAGAAC-3' and 5'-AUCAAA-3' RNAs, respectively. The peaks corresponding
to free and RNA-bound protein states (RNA:protein ratios of 0.3:1 and 1:1) are blue, orange and red, respectively. The underlined sequences represent the
nucleotides that are bound by each protein. On this diagram, red and blue ovals represent Tra2-β1 and hnRNP G RRMs, respectively. (B) Close-ups on
the titration of Tra2-β1 and hnRNP G RRMs with the unlabeled SMN2 derived 5'-UCAAAAAGAAG-3' RNA containing binding sites of both proteins
(underlined sequences). The color code is similar to (A) except that the 1:1 bound state is in cyan. The code for the diagrams is similar to (A). (C) Close-ups
on overlay of 1H-15N HSQC spectra corresponding to the binding of 15N labeled hnRNP G RRM and 15N labeled Tra2-β1 RRM to the unlabeled SMN2
derived 5'-UCAAAAAGAAG-3' RNA containing the binding sites of both proteins (underlined sequences). Blue peaks represent the free proteins and
green peaks represent the complex formed in the presence of Tra2-β1 and hnRNP G RRMs with the RNA at a ratio of 1:1:1. To simplify the spectra,
peaks corresponding to the protein represented by a gray oval on the diagrams are hidden. Full spectra are in Supplementary Figure S6A. (D) Close-ups
on overlay of 1H-15N HSQC spectra corresponding to the binding of 15N labeled hnRNP G RRM and 15N labeled Tra2-β1 RRM to the long unlabeled
SMN2 derived 5'-GAGACAAAAUCAAAAAGAAG-3' RNA containing the binding sites of both proteins (underlined sequences). Blue peaks represent
the free proteins and red peaks represent the complex formed in the presence of Tra2-β1 and hnRNP G RRMs with the RNA at a ratio of 1:1:1. Full
spectra are shown in Supplementary Figure S6A.

Surprisingly, the comparison of the bound state of hnRNP G
on the 5'-UCAAAAAGAAG-3' RNA in the presence and absence of
Tra2-β1 reveals that the chemical shift perturbations of
hnRNP G are smaller when Tra2-β1 is already bound to RNA
(Figure 4B and C). This decrease in the RNA binding
affinity of hnRNP G can be explained by the reduction of
the number of its binding sites available when Tra2-β1 is
bound to the 5'-AGAA-3' motif. Indeed, Tra2-β1 occupies
this motif with around 8-fold higher affinity (Kd = 2.25 µM)
(14) than hnRNP G, leaving only the first 5'-AAN-3' binding
site available instead of five. However, upon binding to
the 5'-UCAAAAAGAAG-3' RNA/Tra2-β1 complex, the chemical
shift perturbations of hnRNP G become smaller than the ones
observed with the 5'-AUCAAA-3' RNA (Figure 4A,C). This
effect is most likely due to the close proximity between
the binding sites of hnRNP G and Tra2-β1, making the last
3' adenine inaccessible for hnRNP G.
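
The fold differences quoted in this section follow directly
from the reported dissociation constants. The short sketch
below only redoes that arithmetic; the variable and function
names are illustrative, and it assumes that the 8-fold figure
compares the Tra2-β1 value (2.25 µM) with the 18 µM measured
for hnRNP G RRM on the short RNA.

```python
# Dissociation constants quoted in the text, in micromolar.
KD_HNRNPG_SHORT_UM = 18.0   # hnRNP G RRM with the short RNA
KD_HNRNPG_LONG_UM = 0.6     # hnRNP G RRM with 5'-UCAAAAAGAAG-3'
KD_TRA2B1_UM = 2.25         # Tra2-β1 RRM with 5'-AGAA-3' (ref. 14)


def fold_tighter(kd_weak_um, kd_tight_um):
    """Ratio of two Kd values; a larger ratio means tighter binding
    for the second partner (lower Kd = higher affinity)."""
    return kd_weak_um / kd_tight_um


# Tra2-β1 versus hnRNP G on a single site (assumed comparison).
print(round(fold_tighter(KD_HNRNPG_SHORT_UM, KD_TRA2B1_UM), 1))
# Apparent gain of hnRNP G on the long RNA (avidity effect).
print(round(fold_tighter(KD_HNRNPG_SHORT_UM, KD_HNRNPG_LONG_UM), 1))
```

Under that assumption the first ratio is 8 and the second is
30, so the avidity effect on the long RNA is larger than the
single-site advantage of Tra2-β1.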

To investigate whether this decrease in affinity could be
compensated by hnRNP G/Tra2-β1 interactions (12-14), we
produced full-length versions of both proteins and tested
their binding to the 5'-UCAAAAAGAAG-3' RNA using gel shift
assays. As expected, we observed a shift corresponding to
the binding of each single protein to RNA (Figure 5A).
However, no additional shift was observed in the presence
of the two proteins (Figure 5A). This indicates that the
A17-A20 tract is not the binding site used by hnRNP G in
the presence of Tra2-β1.

We then investigated the importance of the most upstream
A-tract (A11-A14) by testing the binding of the full-length
proteins to an extended SMN2 derived RNA,
5'-GAGACA(11)AAA(14)UCAAAAAGAAG(25)-3' (numbers in
parentheses give the exon 7 position of the preceding
nucleotide). As illustrated in Figure 5B, a shift was
observed in the presence of individual proteins and an
upper band appeared when the two proteins were added
together, indicating that both proteins could be
accommodated on the RNA. In addition, an NMR titration
performed with this RNA shows that in the presence of bound
Tra2-β1 RRM, the affinity of hnRNP G RRM is similar to what
was observed with the short 5'-AUCAAA-3' RNA (Figure 4A and
D) and not reduced as in the presence of the shorter
5'-UCAAAAAGAAG-3' sequence (Figure 4C). Altogether, these
results suggest that hnRNP G binds to the A11-A14 site on
SMN2 exon 7 when Tra2-β1 is bound to the A21GAA24 motif.

Several domains of hnRNP G are important for the activation 
of SMN2 exon 7 inclusion 

Our results reveal that hnRNP G has a weak RNA binding
specificity and affinity, emphasizing the importance of its
recruitment by Tra2-β1 to SMN2 pre-mRNA. It was previously
reported that the CTD part of hnRNP G is responsible for
its interaction with Tra2-β1 (12) (Figure 1A). In
particular, the C-RBD seemed to be important for this
interaction as truncation of the last 41 amino acids
significantly decreases hnRNP G affinity for Tra2-β1
without affecting its SMN2 binding capacity (12). In
agreement with the functional importance of this domain,
full truncation of hnRNP G C-RBD (ΔC57) strongly decreased
the percentage of exon 7 inclusion in SMN2 mRNAs from 65 to
30% (Figure 3A, lane 8). Previous reports suggested that
hnRNP G/Tra2-β1 interaction could be mediated by the SRGY
motif and several RS repeats located in the CTD of hnRNP G
with one of the RS domains of Tra2-β1 (14,35). However,
mutation of all RS repeats located in the C-RBD and/or the
SRGY motif (Figure 1A) to alanines did not induce
significant reduction of exon 7 inclusion, indicating that
these motifs are not crucial for hnRNP G recruitment to
SMN2 (Figure 3A, lanes 9-11).

To investigate whether additional parts of hnRNP G were
involved in SMN2 exon 7 splicing, we tested the importance
of the central region separating its RNA binding site (RRM)
from its Tra2-β1 binding site (the CTD). We progressively
truncated the central region of hnRNP G keeping the RRM
fused to the last 207 (Δ95-184), 156 (Δ95-235) or 141
(Δ95-250) amino acids and tested the effect of these
truncated proteins on SMN2 exon 7 splicing in cells.
Unexpectedly, truncation of only 89 amino acids in this
region of hnRNP G resulted in a strong decrease of SMN2
exon 7 inclusion (Figure 6). Interestingly, this region
contains an RGG box (Figure 1A), which, to our knowledge,
was not reported before to be important for the function of
hnRNP G in SMN2 exon 7 splicing.

To investigate whether the RGG box could be involved in
SMN2 exon 7 recognition, we produced a version of hnRNP G
that contains both the RRM and the RGG box (residues 1-127)
and tested its interaction with the 5'-AUCAAA-3' RNA. As
illustrated in Supplementary Figure S7, the same residues
of the RRM are involved in RNA binding with and without the
RGG box (Figure 1C and Supplementary Figure S7), indicating
that the mode of RNA recognition of the RRM is the same. In
addition, we tested whether the presence of the RGG box
could modulate the specificity of RNA recognition of the
RRM. As observed with the RRM alone, very small chemical
shift perturbations were observed with the suboptimal
5'-AUCCCC-3' RNA sequence in the presence of the RGG box
(Supplementary Figure S7). Finally, no chemical shift
perturbation was detected from the RGG box in the presence
of these two RNAs. These data show that the RGG box does
not bind to the RNAs tested and does not change the
specificity of interaction of hnRNP G RRM with RNA. In
addition to the function of the RRM binding to RNA and the
CTD interacting with Tra2-β1, the central part of the
protein may then play a role either as a spacer between the
protein domains or as a functional entity of hnRNP G.

DISCUSSION 

hnRNP G interacts specifically with two consecutive adenines 

Our structure shows that hnRNP G RRM binds to a 5'-AAN-3'
motif in which the two consecutive adenines are
specifically recognized (Figure 2). This RNA binding
specificity was validated in vitro using ITC and NMR
measurements (Supplementary Figures S3, S4 and S5) and in
cells by splicing assays (Figure 3). In agreement with our
results, hnRNP G was proposed to bind specifically to the
5'-AAGU-3' RNA motif (20). Our data can also explain the
absence of specificity reported by Hofmann and Wirth (12)
when they investigated the interaction of hnRNP G with
SMN2, as they tested the binding of hnRNP G to long RNA
sequences containing two consecutive adenines (12,52). In
addition, we show that hnRNP G can bind the 5'-CCC-3' or
5'-CCA-3' motifs selected by SELEX (37) (Supplementary
Figures S3D and S5). However, our structure suggests that
they would not be accommodated by the RRM as well as the
5'-AAN-3' motif (37). Indeed, a cytosine replacing the
first recognized adenine would probably fail to form the
same hydrogen bond network, as the smaller cytosine base
would then be more distant from the side chains of Lys80
and Glu82 (Figure 2C). Similarly, a cytosine located in the
second pocket would be further away from the Lys9 side
chain, preventing the formation of the hydrogen bond
observed in the presence of an adenine (Figure 2D). In
agreement with these predictions, our ITC and NMR data
showed that the affinity of hnRNP G RRM for the
5'-AUCCCC-3' and 5'-AUCCCA-3' RNA sequences is lower than
for 5'-AAN-3' containing RNAs (Supplementary Figures S3D
and S5). Moreover, having the SELEX motif 5' to the
5'-AAN-3' sequence (5'-ACCAAA-3' RNA) does not improve the
RNA binding affinity (Supplementary Figure S5). In
conclusion, we show that hnRNP G binds preferentially
5'-AAN-3' motifs, but can also accommodate 5'-CCC-3' and
5'-CCA-3' motifs.

Figure 5. Gel shift experiments showing binding of full-length hnRNP G and Tra2-β1 proteins to SMN2 derived RNA sequences. (A) Gel shift experiment
showing the binding of full-length hnRNP G and Tra2-β1 to the 5'-UCAAAAAGAAG-3' RNA. The amount of proteins in nanograms is indicated below
each lane. Both proteins could bind the RNA separately but did not bind together. (B) Gel shift experiment showing the binding of increasing amounts of
full-length Tra2-β1 (left) or hnRNP G (middle) to the 5'-GAGACAAAAUCAAAAAGAAG-3' RNA. The right panel shows the binding of both proteins
to the RNA in the presence of 30 ng of Tra2-β1 (indicated by an asterisk on the left panel) and increasing amounts of hnRNP G. The amount of proteins in
nanograms is indicated below each lane. The round head arrow represents the shift corresponding to the binding of the two proteins to RNA, the pointed
head arrow represents binding of single protein molecules to RNA and the square head arrow represents binding of a co-purified truncated version of
Tra2-β1 to the RNA.

Figure 6. Effect of truncations in the central part of hnRNP G on the splicing of SMN2 exon 7. Three truncations were performed in the region of hnRNP
G located between its RRM and the CTD. The effect of these protein variants on SMN2 exon 7 splicing was then tested in cells using the same conditions
as described in Figure 3.

hnRNP G and its paralogues have different RNA specificities

The mode of RNA recognition of hnRNP G RRM is reminiscent
of what was observed in the structure of its paralogue in
testis, RBMY (38), as the two consecutive adenines are
recognized in a fairly similar manner (Supplementary Figure
S8A). However, their modes of interaction are not
identical. RBMY was shown to interact with a stem-loop RNA
by insertion of its β2-β3 loop in the major groove of the
RNA helix, whereas hnRNP G binds only single-stranded RNA
(38). hnRNP G was previously shown not to interact with a
stem because the sequence of its β2-β3 loop is different.
An additional glutamate is inserted between Arg43 and
Thr44, and Ser45 is replaced with an asparagine (38).

The binding of RBMY to a stem-loop RNA partially ex- 
plains some of the other differences observed between the 
two RNA-protein complexes. The involvement of G14 in 
the first base pair of the stem prevents the base from be- 
ing available to contact Phe88 located in the C-terminal re- 
gion of RBMY RRM (Supplementary Figure S8A and B). 
Formation of the corresponding interaction in hnRNP G,
between A6 and Phe89, maintains the C-terminal region of
the protein in a position that allows the formation of a
hydrogen bond between the backbone amide of Ser88 and the
phosphate of A5, which could not be observed in the case of
RBMY (Supplementary Figure S8A). Although these
interactions are nonspecific, they contribute to the RNA
binding affinity of hnRNP G. This could compensate, even partially,
for the inability of this protein to interact with the stem of 
a stem-loop RNA. 

Another major difference is the absence of interaction
between hnRNP G and the cytosine located 5' to the two
recognized adenines, contrary to what was observed with
RBMY (Supplementary Figure S8A). Indeed, no intermolecular
NOEs were observed between the cytosine and any of the
hnRNP G RRM residues, suggesting that this nucleotide stays
flexible (Supplementary Figure S8A). This was surprising
because most residues involved in the recognition of the
cytosine are conserved in both proteins (Supplementary
Figure S9). This difference may originate from two
interactions observed in RBMY involving Arg17 and Arg48,
which form hydrogen bonds with the phosphate group of C11
and A12, respectively (Supplementary Figure S8A). These two
hydrogen bonds possibly stabilize the cytosine by paying
the entropic cost required for positioning the nucleotide.
In hnRNP G, Arg17 is replaced by a threonine and Arg49 (the
equivalent of Arg48 in RBMY) is not restrained by the
interaction with the stem and therefore is oriented
differently to interact with the phosphate groups of A6 or
A5 (Supplementary Figure S8A). As a consequence of the
flexibility of the C3 base, the side chain of Lys80 becomes
available in hnRNP G to form a hydrogen bond with the base
of A4, pulling the base downwards and preventing the
formation of the hydrogen bond observed in RBMY with the
backbone carbonyl oxygen of Gln83 (Supplementary Figure
S8A).

In conclusion, our data show that hnRNP G and RBMY use a
similar but distinct mode of interaction with RNA. hnRNP
G-T, another paralogue of hnRNP G found in testis, was
reported to interact with 5'-GUU-3' containing RNAs (53), a
sequence that is different from what was reported for RBMY
(38) and what we report here for hnRNP G. All these results
strongly suggest that in testis hnRNP G, hnRNP G-T and RBMY
may regulate splicing by interacting with different RNA
sequences and structures.

The RRM, an important domain for the function of hnRNP 
G 

Although the contribution of hnRNP G RRM was in some cases
reported to be negligible for its activity as a splicing
regulator (35,53), our results show that RNA recognition by
hnRNP G RRM is important for the activation of SMN2 exon 7
inclusion (Figure 3A). RRM truncation and mutation of
residues involved in the recognition of two consecutive
adenines significantly reduced the splicing activity of
hnRNP G (Figure 3A). This strongly suggests that the
specific interaction of hnRNP G with SMN2 exon 7
contributes to the selective recruitment of the
Tra2-β1/hnRNP G heterodimer to RNA. It thereby extends the
initial recognition of a short 5'-AGAA-3' motif by Tra2-β1
to the longer sequence 5'-AAAANNNNNNAGAA-3' (the binding
sites of hnRNP G and Tra2-β1 are underlined). In addition,
the weak specificity of interaction of hnRNP G with RNA
allows its binding to different registers on the A-rich
sequences present upstream of the Tra2-β1 binding site.
This increases its RNA binding affinity locally (Figure 4B)
and probably facilitates the recruitment of the hnRNP
G/Tra2-β1 heterodimer at this position before restricting
hnRNP G interaction to the A11-A14 tract. The enhancement
of weak protein-RNA interactions by the repetition of
several consecutive binding motifs was previously reported
(54) and was observed in the context of PTB binding to
pyrimidine tracts (55).
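
As an illustration of the extended recognition sequence, the
composite motif can be located in the two SMN2 derived RNAs
used in this study with a regular expression. This is a
hypothetical sketch rather than an analysis performed here: it
reads 5'-AAAANNNNNNAGAA-3' as AAAA, six nucleotides of any
identity, then AGAA, and it assumes that the first nucleotide
of the extended RNA is exon 7 position 6, as inferred from the
A11, A14 and G25 labels.

```python
import re

EXTENDED_RNA = "GAGACAAAAUCAAAAAGAAG"  # extended SMN2 derived RNA
SHORT_RNA = "UCAAAAAGAAG"              # shorter SMN2 derived RNA
FIRST_POSITION = 6  # assumed exon 7 position of the first nucleotide

# AAAA, then six nucleotides of any identity, then AGAA.
COMPOSITE = re.compile(r"AAAA[ACGU]{6}AGAA")


def locate_composite_site(rna, first_position=FIRST_POSITION):
    """Return (first, last) positions of the first match, or None."""
    match = COMPOSITE.search(rna.upper())
    if match is None:
        return None
    # match.end() is exclusive, hence the - 1 for the last nucleotide.
    return (match.start() + first_position,
            match.end() - 1 + first_position)


print(locate_composite_site(EXTENDED_RNA))  # spans A11 to A24
print(locate_composite_site(SHORT_RNA))     # None: no room for the spacer
```

Under these assumptions the match on the extended RNA runs from
A11 to A24, whereas the shorter RNA contains no match, which is
consistent with the gel shift results.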

Interestingly, the RRM was also shown to be important for
the function of hnRNP G as a tumor suppressor. Two point
mutations of residues located in this domain, namely K22R
and G29D, were reported to affect the tumor suppressive
activity of hnRNP G (26,56). Surprisingly, these residues
are not located on the RNA binding surface of the domain.
They are both in the α1 helix, an element that could rather
be involved in intra- or inter-protein interactions, as
reported for the human SR protein SRSF1 (57,58). This shows
that the RRM is an important domain of the protein that
contributes to hnRNP G functions using different modes of
action.

Tra2-β1 and hnRNP G, a simultaneous versus competitive
binding to RNA

It was previously reported that hnRNP G and Tra2-β1
sometimes have opposite effects on splicing (20). hnRNP G
can sequester Tra2-β1 and other SR proteins, preventing
their access to RNA and thereby their activity on splicing
(35,53) (Figure 7A). Alternatively, hnRNP G and Tra2-β1
could also antagonize the binding of each other to RNA (20)
(Figure 7A). This latter mode of action would explain the
opposite effects observed for these two proteins on
endometrial cancer progression (59). As illustrated in
Figure 4C, we also observed a competition between these two
proteins when both Tra2-β1 and hnRNP G RRMs are bound to
the SMN2 derived 5'-UCAAAAAGAAG-3' RNA. In that case, the
binding of Tra2-β1 restricted the binding of hnRNP G to the
first 5'-AAN-3' motif instead of the five overlapping
registers it could utilize in the absence of Tra2-β1. Our
data reveal that this effect is due to the overlap of the
5'-AAN-3' motif bound by hnRNP G with the previously
identified 5'-AGAA-3' motif recognized by Tra2-β1 (14,15)
(Figure 7A) and to the 8-fold higher affinity of Tra2-β1
for its binding site on SMN2 (14). Finally, we demonstrate
here that Tra2-β1 and hnRNP G can bind together a single
RNA molecule that contains their recognition motifs
adjacent to each other (Figure 7B), as shown with the long
SMN2 derived 5'-GAGACAAAAUCAAAAAGAAG-3' RNA sequence
(Figures 4D and 5B). We also show that a spacing between
the two binding sites is necessary, most likely to allow
protein-protein interactions between these two partners.
Altogether, these data indicate that Tra2-β1 and hnRNP G
can either compete for binding to RNA or interact
simultaneously depending on the targeted RNA sequence.

Figure 7. Models representing the competitive and cooperative binding modes of hnRNP G and Tra2-β1 to RNA. (A) Two competitive binding modes
of hnRNP G and Tra2-β1 to RNA. hnRNP G either sequesters Tra2-β1 through protein-protein interactions preventing its binding to RNA (54) (model
on the left) or the two proteins compete together to bind the same overlapping binding site (20) (model on the right). (B) Model showing the assembly of
hnRNP G, Tra2-β1 and SRSF9 on exon 7 of SMN2 pre-mRNA (14). hnRNP G, Tra2-β1 and SRSF9 are represented in blue, red and green, respectively.
The model shows interaction of hnRNP G with the binding site identified in this study, using its RRM and positioning the CTD toward the 3' end of
the bound RNA. Tra2-β1 RRM binds to its previously identified binding site located downstream positioning its N-terminal RS1 domain toward the 5'
end of the bound RNA (14). This RNA binding induced proximity between the RS1 domain of Tra2-β1 and the CTD of hnRNP G probably favors their
interaction. The RS2 domain of Tra2-β1 is positioned toward the 3' end of the bound RNA and may interact with the RS domain of SRSF9 (14).

The mode of action of Tra2-β1 and hnRNP G as activators
of SMN2 exon 7 splicing

Our data support a specific recognition of the SMN2
pre-mRNA by hnRNP G and Tra2-β1 (Figures 3-5). We propose
that the presence of several 5'-AAN-3' motifs upstream of
the Tra2-β1 binding site facilitates the recruitment of
hnRNP G to SMN2 by Tra2-β1 and therefore increases the
specificity and affinity of exon 7 recognition by the
heterodimer (Figure 7B). Our structure shows that the
C-terminal region of hnRNP G RRM positions the CTD
downstream of the hnRNP G binding site (Figures 2B and 7B).
In agreement with our data, the CTD was previously shown to
be responsible for hnRNP G interaction with Tra2-β1 (12).
The mode of interaction of hnRNP G with Tra2-β1 still needs
to be characterized but seems not to occur via the RS
repeats located in the CTD of hnRNP G (Figure 3A). In
addition, our data suggest that this Tra2-β1/hnRNP G
interaction could partially compensate for the loss of
hnRNP G binding to SMN2. Indeed, mutations in hnRNP G RRM
that significantly decrease its binding to RNA or even a
full truncation of the RRM had only a moderate effect on
SMN2 exon 7 splicing (Figure 3A). In conclusion, we propose
that Tra2-β1 anchors the binding of the heterodimer by
positioning and stabilizing hnRNP G binding to RNA, which
in turn increases the affinity and specificity of
recognition of SMN2 exon 7 by interacting with the A-tracts
located upstream of the Tra2-β1 binding site (Figure 7B).
Interestingly, the 5'-AAN-3' motif was found within eight
nucleotides upstream and/or downstream of 85% of potential
Tra2-β1 binding sites selected by CLIP (60), strongly
suggesting that this mode of Tra2-β1 and hnRNP G assembly
on RNA could be more widely used.
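
The kind of proximity criterion mentioned above can be
expressed as a simple window scan. The sketch below is only one
possible formalization and not the procedure used for the CLIP
data: the function names are hypothetical, a Tra2-β1 site is
taken to be any 5'-AGAA-3' occurrence, and a 5'-AAN-3' motif
counts only if it lies entirely within the eight nucleotides
immediately upstream or downstream of that site.

```python
import re


def aan_starts(rna):
    """0-based starts of every 'AA' followed by one more nucleotide."""
    return [i for i in range(len(rna) - 2) if rna[i:i + 2] == "AA"]


def agaa_starts(rna):
    """0-based starts of every AGAA, including overlapping ones."""
    return [m.start() for m in re.finditer(r"(?=AGAA)", rna)]


def has_flanking_aan(rna, window=8):
    """True if any AGAA has an AAN fully inside a flanking window."""
    rna = rna.upper().replace("T", "U")
    motifs = aan_starts(rna)
    for s in agaa_starts(rna):
        up_lo, up_hi = s - window, s - 1          # upstream, inclusive
        down_lo, down_hi = s + 4, s + 3 + window  # downstream, inclusive
        for i in motifs:
            j = i + 2  # last nucleotide of the AAN motif
            if (up_lo <= i and j <= up_hi) or (down_lo <= i and j <= down_hi):
                return True
    return False


print(has_flanking_aan("GAGACAAAAUCAAAAAGAAG"))  # extended SMN2 RNA
print(has_flanking_aan("CCCCCCCCCCAGAACCCCCCCC"))  # isolated AGAA
```

With this definition the extended SMN2 derived RNA satisfies
the criterion through its upstream A-tract, while an AGAA
surrounded only by cytosines does not.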

Although we better understand the assembly of these factors
on SMN2 exon 7, their modes of regulation of exon 7
splicing still need to be characterized. Interestingly, the
two proteins, Tra2-β1 and hnRNP G, were identified as part
of the supraspliceosome (61). One or both of these factors
could then play a role in the spliceosome recruitment or
assembly by inducing protein-protein interactions. At least
two domains of hnRNP G would still be available for
spliceosome recruitment. The NTD located in the central
region of hnRNP G was shown to be responsible for the
interaction of hnRNP G with other proteins (36). In
addition, we showed that the region of hnRNP G located
between the RRM and the NTD (Figure 1A) was important for
the splicing of SMN2 exon 7 (Figure 6). This part of the
protein contains an RGG box, which could mediate additional
protein interactions. Tra2-β1 could also contribute to exon
7 splicing activation by promoting the recruitment of
spliceosome components through one or both of its RS
domains. Finally, the SR protein SRSF9 (SRp30c), which is
also part of the supraspliceosome (61), was proposed to be
recruited to SMN2 exon 7 by Tra2-β1 (62). SRSF9 could
therefore be involved as well in the recruitment of the
spliceosome using its C-terminal RS domain. Further
investigation is now needed to identify the exact
contribution of these three proteins to SMN2 exon 7
splicing.